ABSTRACT

The limiting distribution $\mu$ of the normalized number of key comparisons required by the
Quicksort sorting algorithm is known to be the unique fixed point of a certain distributional
transformation $T$; unique, that is, subject to the constraints of zero mean and finite variance.
We show that a distribution is a fixed point of $T$ if and only if it is the convolution of $\mu$ with a
Cauchy distribution of arbitrary center and scale. In particular, therefore, $\mu$ is the unique fixed
point of $T$ having zero mean.
1 Introduction, motivation, and summary

Let $\mathcal{M}$ denote the class of all probability measures (distributions) on the real line. This
paper concerns the transformation $T$ defined on $\mathcal{M}$ by letting $T\nu$ be the distribution of

$UZ + (1-U)Z^* + g(U),$

where $U$, $Z$, and $Z^*$ are independent, with $Z \sim Z^* \sim \nu$ and $U \sim \mathrm{unif}(0, 1)$, and
where

(1.1) $g(u) := 2u \ln u + 2(1 - u) \ln(1 - u) + 1.$

Of course, $T$ can be regarded as a transformation on the class of characteristic
functions $\psi$ of elements of $\mathcal{M}$. With this interpretation, $T$ takes the form

$(T\psi)(t) = E[\psi(Ut)\,\psi((1-U)t)\,\exp[itg(U)]] = \int_{u=0}^{1} \psi(ut)\,\psi((1-u)t)\,e^{itg(u)}\,du, \quad t \in \mathbb{R}.$
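To make the definition of $T$ concrete, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not part of the paper's argument: the names `g`, `apply_T_once`, and `mean_of_g` are hypothetical, and resampling from a finite sample stands in for drawing $Z$ and $Z^*$ from $\nu$.

```python
import math
import random


def g(u: float) -> float:
    """g(u) = 2u ln u + 2(1-u) ln(1-u) + 1 as in (1.1), reading 0 * ln 0 as 0."""
    def xlogx(x: float) -> float:
        return x * math.log(x) if x > 0.0 else 0.0
    return 2.0 * xlogx(u) + 2.0 * xlogx(1.0 - u) + 1.0


def apply_T_once(sample, size, rng):
    """Draw `size` approximate variates from T(nu), where nu is taken to be
    the empirical distribution of `sample` (an assumption of this sketch)."""
    out = []
    for _ in range(size):
        u = rng.random()                      # U ~ unif(0, 1)
        z = rng.choice(sample)                # Z ~ nu
        z_star = rng.choice(sample)           # Z* ~ nu, independent of Z
        out.append(u * z + (1.0 - u) * z_star + g(u))
    return out


def mean_of_g(n: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of g over (0, 1)."""
    return sum(g((k + 0.5) / n) for k in range(n)) / n
```

Feeding in the one-point sample `[0.0]` returns draws of $g(U)$ alone, which lie between $1 - 2\ln 2$ and $1$, and `mean_of_g()` illustrates that $g$ integrates to zero over $(0, 1)$, the property that lets $T$ preserve a zero mean.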

It is well known [?] that (i) among distributions with zero mean and finite variance,
$T$ has a unique fixed point, call it $\mu$; and (ii) if $C_n$ denotes the random number of key
comparisons required by the algorithm Quicksort to sort a file of $n$ records, then the
distribution of $(C_n - E C_n)/n$ converges weakly to $\mu$.

There are other fixed points. For example, it has been noted frequently in the
literature that the location family generated by $\mu$ is a family of fixed points. But there
are many more fixed points, as we now describe. Define the Cauchy$(m, \sigma)$ distribution
(where $m \in \mathbb{R}$ and $\sigma \geq 0$) to be the distribution of $m + \sigma C$, where $C$ has the standard
Cauchy distribution with density $x \mapsto [\pi(1 + x^2)]^{-1}$, $x \in \mathbb{R}$; equivalently, Cauchy$(m, \sigma)$
is the distribution with characteristic function $e^{imt - \sigma|t|}$. [In particular, the Cauchy$(m, 0)$
distribution is unit mass at $m$.] Now let $\mathcal{F}$ denote the class of all fixed points of $T$,
and let $\mathcal{C}$ denote the class of convolutions of $\mu$ with a Cauchy distribution. Using
characteristic functions it is easy to check that $\mathcal{C} \subseteq \mathcal{F}$, and that all of the distributions
in $\mathcal{C}$ are distinct. In this paper we will prove that, conversely, $\mathcal{F} \subseteq \mathcal{C}$, and thereby
establish the following main result.

Theorem 1.1. The class $\mathcal{F}$ equals $\mathcal{C}$. That is, a measure $\nu$ is a fixed point of the
Quicksort transformation $T$ if and only if it is the convolution of the limiting Quicksort
measure $\mu$ with a Cauchy distribution of arbitrary center $m$ and scale $\sigma$. In particular,
$\mathcal{F}$ is in one-to-one correspondence with the set $\{(m, \sigma) : m \in \mathbb{R},\ \sigma \geq 0\}$.
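The easy half of the theorem, that every convolution of $\mu$ with a Cauchy distribution is a fixed point of $T$, can be checked in one line with characteristic functions. The following is a sketch of that check rather than a display taken from the paper; the symbol $\psi_\mu$ for the characteristic function of $\mu$ is notation assumed here for illustration. Take $t \geq 0$ (negative $t$ follows by complex conjugation) and let $\psi(t) = \psi_\mu(t)\,e^{(im - \sigma)t}$ be the characteristic function of the convolution of $\mu$ with Cauchy$(m, \sigma)$.

```latex
\begin{align*}
(T\psi)(t)
  &= \int_{u=0}^{1} \psi_\mu(ut)\,\psi_\mu((1-u)t)\,
     e^{(im-\sigma)ut}\,e^{(im-\sigma)(1-u)t}\,e^{itg(u)}\,du \\
  &= e^{(im-\sigma)t} \int_{u=0}^{1} \psi_\mu(ut)\,\psi_\mu((1-u)t)\,e^{itg(u)}\,du \\
  &= e^{(im-\sigma)t}\,(T\psi_\mu)(t)
   = e^{(im-\sigma)t}\,\psi_\mu(t)
   = \psi(t).
\end{align*}
```

The Cauchy factor passes through $T$ unchanged because its exponent is linear in $t$ on $[0, \infty)$, so the contributions from $ut$ and $(1-u)t$ simply add up to $t$.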

The following corollary is immediate and strengthens Rösler's characterization
of $\mu$ as the unique element of $\mathcal{F}$ having zero mean and finite variance.

Corollary 1.2. The limiting Quicksort measure $\mu$ is the unique fixed point of the
Quicksort transformation $T$ having finite expectation equal to 0. □

The present paper can be motivated in two ways. First, the authors are writing a
series of papers refining and extending Rösler's [?] probabilistic analysis of Quicksort.
No closed-form expressions are known for any of the standard functionals (e.g.,
characteristic function, distribution function, density function) associated with $\mu$; information
to be obtained about $\mu$ must be read from the fixed-point identity it satisfies. We were
curious as to what extent additional known information about $\mu$, such as the fact that
it has everywhere finite moment generating function, must be brought to bear. As one
example, it is believed that the continuous Lebesgue density $f$ (treated in [?]) for $\mu$
decays at least exponentially quickly to $0$ at $\pm\infty$, cf. [?]. But we now know from
Theorem 1.1 that there can be no proof for this conjecture that solely makes use of
the information that $\mu \in \mathcal{F}$.

Second, we view the present paper as a pilot study of fixed points for a class of
distributional transformations on the line. In the more general setting, we would be
given (the joint distribution of) a sequence $(A_i : i \geq 0)$ of random variables and would
define a transformation $T$ on $\mathcal{M}$ by letting $T\nu$ be the distribution of $A_0 + \sum_{i=1}^{\infty} A_i Z_i$,
where $Z_1, Z_2, \ldots$ are independent random variables with distribution $\nu$. [To ensure
well-definedness, one might (for example) require that (almost surely) $A_i \neq 0$ for only finitely
many values of $i$.] For probability measures $\nu$ on $[0, \infty)$, rather than on $\mathbb{R}$, and with
the additional restrictions that $A_0 = 0$ and $A_i \geq 0$ for all $i \geq 1$, such transformations
are called generalized smoothing transformations. These have been thoroughly studied
by Durrett and Liggett [?], Guivarc'h and Liu [?], and by other authors; consult
the three papers we have cited here for further bibliographic references. Generalized
smoothing transformations have applications to interacting particle systems, branching
processes and branching random walk, random set constructions, and statistical
turbulence theory. The arguments used to characterize the set of fixed points for generalized
smoothing transformations make heavy use of Laplace transforms; unfortunately, these
arguments do not carry over readily to distributions on the line. Other authors (see,
e.g., [?]) have treated fixed points of transformations of measures $\nu$ on the whole
line as discussed above, but not without finiteness conditions on the moments of $\nu$.



We now outline our proof of Theorem 1.1. Let $\psi$ be the characteristic function of
a given $\nu \in \mathcal{F}$, and let $r(t) := \psi(t) - 1$, $t \in \mathbb{R}$. In Section 2 we establish and solve
(in a certain sense) an integral equation satisfied by $r$. In Section 3 we then use the
method of successive substitutions to derive asymptotic information about $r(t)$ as $t \downarrow 0$,
showing first that $r(t) = O(t^{2/3})$, next that $r(t) = \beta t + o(t)$ for some $\beta = -\sigma + im \in \mathbb{C}$
with $\sigma \geq 0$, and finally that $r(t) = \beta t + O(t^2)$. In Section 4 we use this information to
argue that there exist random variables $Z_1 \sim \nu$, $Z_2 \sim \mu$, and $C \sim$ Cauchy$(m, \sigma)$ such
that $Z_1 = Z_2 + C$. We finish the proof by showing that one can take $Z_2$ and $C$ to be
independent, whence $\nu \in \mathcal{C}$.



2 An integral equation 

Let $\psi$ denote the characteristic function of a given $\nu \in \mathcal{F}$. Since $\psi(-t) = \overline{\psi(t)}$, we shall
only need to consider $\psi(t)$ for $t \geq 0$. For notational convenience, define

$r(t) := \psi(t) - 1, \quad t \geq 0.$

Rearranging the fixed-point integral equation $(T\psi)(t) = \psi(t)$, we obtain the following
result.
Lemma 2.1. The function $r$ satisfies the integral equation

$r(t) = 2 \int_{u=0}^{1} r(ut)\,du + b(t), \quad t \geq 0,$

where

(2.1) $b(t) := \int_{u=0}^{1} r(ut)\,r((1-u)t)\,du + it \int_{u=0}^{1} [\psi(ut)\,\psi((1-u)t) - 1]\,g(u)\,du + a(t),$

with

(2.2) $a(t) := \int_{u=0}^{1} \psi(ut)\,\psi((1-u)t)\,[e^{itg(u)} - 1 - itg(u)]\,du,$

$|a(t)| \leq \frac{1}{2}\,t^2 \int_{u=0}^{1} g^2(u)\,du = \left(\frac{7}{6} - \frac{\pi^2}{9}\right) t^2.$

□
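Lemma 2.1 is stated without proof; the rearrangement behind it can be sketched as follows. This is a worked expansion under the definitions above, using only $\psi = 1 + r$ and $\int_{0}^{1} g(u)\,du = 0$. The abbreviations $\psi_1 := \psi(ut)$, $\psi_2 := \psi((1-u)t)$, $r_1 := r(ut)$, $r_2 := r((1-u)t)$ are introduced here for illustration only.

```latex
\begin{align*}
\psi(t) = (T\psi)(t)
  &= \int_{u=0}^{1} \psi_1 \psi_2 \,du
   + it \int_{u=0}^{1} \psi_1 \psi_2\, g(u)\,du
   + \int_{u=0}^{1} \psi_1 \psi_2 \,[e^{itg(u)} - 1 - itg(u)]\,du \\
  &= \int_{u=0}^{1} (1 + r_1 + r_2 + r_1 r_2)\,du
   + it \int_{u=0}^{1} [\psi_1 \psi_2 - 1]\, g(u)\,du + a(t) \\
  &= 1 + 2 \int_{u=0}^{1} r(ut)\,du
   + \int_{u=0}^{1} r_1 r_2\,du
   + it \int_{u=0}^{1} [\psi_1 \psi_2 - 1]\, g(u)\,du + a(t).
\end{align*}
```

The second line inserts the term $-1$ at no cost because $g$ integrates to zero, and the third uses the symmetry $u \leftrightarrow 1-u$ to merge the two linear terms. Subtracting 1 from both sides leaves $r(t)$ on the left and $2\int_{u=0}^{1} r(ut)\,du + b(t)$ on the right.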



Note that $r$ and $b$ are continuous on $[0, \infty)$, with $r(0) = 0 = b(0)$. Regarding $b$ as
"known", the integral equation in Lemma 2.1 is easily "solved" for $r$:



Proposition 2.2. For some constant $c \in \mathbb{C}$, we have

$\frac{r(t)}{t} = c - 2 \int_{v=t}^{1} \frac{b(v)}{v^2}\,dv + \frac{b(t)}{t}, \quad t > 0.$



Proof. Setting $h(t) := t[r(t) - b(t)]$, Lemma 2.1 implies

$h(t) = 2 \int_{v=0}^{t} \left[\frac{h(v)}{v} + b(v)\right] dv, \quad t > 0.$

Thus $h$ is continuously differentiable on $(0, \infty)$ and satisfies the differential equation

$h'(t) = \frac{2}{t}\,h(t) + 2b(t)$

there. This is an easy differential equation to solve for $h$, and we find that

$h(t) = ct^2 - 2t^2 \int_{v=t}^{1} \frac{b(v)}{v^2}\,dv, \quad t > 0,$

for some $c \in \mathbb{C}$. After rearrangement, the proposition is proved. □
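The closed form for $h$ in the proof of Proposition 2.2 can be sanity-checked numerically. The sketch below is illustrative only: `h_solution` and `ode_residual` are hypothetical helper names, the midpoint rule and the central difference are crude stand-ins for exact calculus, and any test function passed as `b` is an arbitrary real-valued choice rather than the $b$ of (2.1).

```python
def h_solution(t, c, b, steps=20_000):
    """h(t) = c t^2 - 2 t^2 * integral_{v=t}^{1} b(v) / v^2 dv (midpoint rule)."""
    width = (1.0 - t) / steps
    integral = 0.0
    for k in range(steps):
        v = t + (k + 0.5) * width          # midpoint of the k-th cell of [t, 1]
        integral += b(v) / v ** 2
    integral *= width
    return c * t * t - 2.0 * t * t * integral


def ode_residual(t, c, b, eps=1e-4):
    """h'(t) - (2/t) h(t) - 2 b(t), with h' from a central difference.
    A value near zero indicates that h solves the differential equation."""
    dh = (h_solution(t + eps, c, b) - h_solution(t - eps, c, b)) / (2.0 * eps)
    return dh - 2.0 * h_solution(t, c, b) / t - 2.0 * b(t)
```

With the test choice $b(v) = v^2$ the integral equals $1 - t$, so $h(t) = (c - 2)t^2 + 2t^3$, and one can confirm by hand that $h'(t) = 2(c-2)t + 6t^2 = (2/t)h(t) + 2b(t)$; calling `ode_residual(0.3, 1.5, lambda v: v * v)` should return a value close to zero.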



3 Behavior of r near 0

We now proceed in stages, using Proposition 2.2 as our basic tool, to get ever more
information about the behavior of $r$ (especially near 0).

Lemma 3.1. Let $\psi = 1 + r$ denote the characteristic function of a given $\nu \in \mathcal{F}$. Then
there exists a constant $C < \infty$ such that

$|r(t)| \leq C t^{2/3}$ for all $t \geq 0$.
Proof. Let

$M(t) := \max\{|r(s)| : 0 \leq s \leq t\} \leq 2, \quad t \geq 0.$

From (2.1) and (2.2), we see immediately that, for $0 < t \leq 1$,

$|b(t)| \leq M^2(t) + O(t).$

Therefore, for $0 < t \leq 1$, Proposition 2.2 yields

$|r(t)| \leq M^2(t) + 2t \int_{v=t}^{1} \frac{M^2(v)}{v^2}\,dv + e(t) = M^2(t) + 2 \int_{u=t}^{1} M^2(t/u)\,du + e(t),$

where

$e(t) = O(t \log(1/t)) + O(t) = O(t^{2/3}).$

Consequently, again for $0 < t \leq 1$ (but then trivially for all $t \geq 0$),

$M(t) \leq M^2(t) + 2 \int_{u=0}^{1} M^2(t/u)\,du + O(t^{2/3}).$

Fix $0 < a < 1$; later in the proof we shall see that $a = 1/8$ suffices for our purposes.
Since $M(t) \to 0$ as $t \to 0$, we can choose $t_0 > 0$ such that $M(t_0) \leq a$. Then, for
$0 < t \leq t_0$,

$M(t) \leq M^2(t) + 2 \int_{u=0}^{t/t_0} M^2(t/u)\,du + 2 \int_{u=t/t_0}^{1} M^2(t/u)\,du + O(t^{2/3})$

$\leq aM(t) + 8\,\frac{t}{t_0} + 2a \int_{u=t/t_0}^{1} M(t/u)\,du + O(t^{2/3})$

and thus

$M(t) \leq \frac{2a}{1-a} \int_{u=0}^{1} M(t/u)\,du + O(t^{2/3}).$

Since $M$ is bounded, this is trivially true also for $t > t_0$. Summarizing, for some constant
$C < \infty$ we have, with $U \sim \mathrm{unif}(0, 1)$,

(3.1) $M(t) \leq \frac{2a}{1-a}\,E\,M(t/U) + C t^{2/3}, \quad t > 0.$

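Now fix the value of $a$ to be any number in $(0, 1/7)$, say $a = 1/8$. Then a
straightforward induction [substituting (3.2) into (3.1) for the induction step] shows that for
any nonnegative integer $n$ we have, for all $t > 0$,

(3.2) $M(t) \leq \left(\frac{2a}{1-a}\right)^{n} E\,M\!\left(\frac{t}{U_1 \cdots U_n}\right) + C t^{2/3} \sum_{k=0}^{n-1} \left(\frac{6a}{1-a}\right)^{k},$

where $U_1, U_2, \ldots$ are independent unif$(0, 1)$ random variables and the ratio $6a/(1-a)$ arises because $E\,U^{-2/3} = 3$.

Recalling that $M$ is bounded and letting $n \to \infty$, we obtain the desired conclusion, with
the constant $C$ of (3.1) replaced by $\frac{1-a}{1-7a}\,C$ (that is, by $7C$ when $a = 1/8$). □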
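The role of the threshold $a < 1/7$ in the induction can be made explicit with exact rational arithmetic. The following is a small sketch, not taken from the paper: the function name `induction_constants` is hypothetical, and it relies on the elementary fact that $E\,U^{-2/3} = \int_{0}^{1} u^{-2/3}\,du = 3$ for $U \sim \mathrm{unif}(0,1)$, which is where the extra factor 3 in the geometric ratio comes from.

```python
from fractions import Fraction


def induction_constants(a: Fraction):
    """Return (2a/(1-a), 6a/(1-a), (1-a)/(1-7a)) for 0 < a < 1/7.

    The first value multiplies the E M(...) term at each induction step,
    the second is the ratio of the geometric series in front of t^(2/3),
    and the third is the sum of that series over all k >= 0.
    """
    if not (0 < a < Fraction(1, 7)):
        raise ValueError("need 0 < a < 1/7 for the geometric series to converge")
    shrink = 2 * a / (1 - a)
    ratio = 3 * shrink                 # factor 3 = E[U^(-2/3)]
    total = 1 / (1 - ratio)            # equals (1 - a) / (1 - 7a)
    return shrink, ratio, total
```

For the paper's choice $a = 1/8$ this gives the values $2/7$, $6/7$, and $7$: the first term of the iterated bound decays like $(2/7)^n$ while the accumulated constant in front of $t^{2/3}$ stays bounded by $7C$.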
Lemma 3.2. Let $\psi = 1 + r$ denote the characteristic function of a given $\nu \in \mathcal{F}$, and
define $b$ by (2.1). Then

$r(t) = (c - 2J)t + o(t)$ as $t \downarrow 0,$

where $J$ is the absolutely convergent integral

(3.3) $J := \int_{v=0}^{1} \frac{b(v)}{v^2}\,dv.$

Proof. Combining (2.1)-(2.2) and Lemma 3.1, we obtain

$|b(t)| \leq O(t^{4/3}) + O(t^{1+(2/3)}) + O(t^2) = O(t^{4/3}).$

Thus the integral $J$ converges absolutely, and from Proposition 2.2 we obtain the desired
conclusion about $r$. □



Lemma 3.2 is all we will need in the next section, but the following refinement follows 
readily and takes us as far as we can go with the method of successive substitutions. 

Corollary 3.3. Let $\psi$ denote the characteristic function of a given $\nu \in \mathcal{F}$. Then there
exists a constant $\beta = im - \sigma \in \mathbb{C}$ with $\sigma \geq 0$ such that

$\psi(t) = 1 + \beta t + O(t^2)$ as $t \downarrow 0.$

Proof. Combining (2.1)-(2.2) and Lemma 3.2, we readily obtain $b(t) = O(t^2)$.
Therefore, by Proposition 2.2,

$\psi(t) - 1 = r(t) = (c - 2J)t + 2t \int_{v=0}^{t} \frac{b(v)}{v^2}\,dv + b(t) = \beta t + O(t^2),$

with $\beta = im - \sigma := c - 2J$. Since $|\psi(t)| \leq 1$ for all $t$, we must have $\sigma \geq 0$. □

4 Proof of the main theorem 
4.1 Further preliminaries 

In Sections 4.1-4.2 we complete the proof of our main Theorem 1.1. To do this, we begin
with a key result that any characteristic function with expansion as in Corollary 3.3
[more generally, we allow the remainder term there to be simply $o(t)$] is in the domain
of attraction of (iterates of) the "homogeneous" analogue $T_0$ of $T$. (Here $\Longrightarrow$ denotes
weak convergence of probability measures.)

Theorem 4.1. Let $\psi$ be any characteristic function satisfying

(4.1) $\psi(t) = 1 + \beta t + o(t) = 1 + imt - \sigma t + o(t)$ as $t \downarrow 0$

for some $\beta = im - \sigma \in \mathbb{C}$, with $m \in \mathbb{R}$ and $\sigma \geq 0$. Let $\nu$ be the corresponding probability
measure. Then

$T_0^n \nu \Longrightarrow$ Cauchy$(m, \sigma)$,

where $T_0$ is the homogeneous analogue of the Quicksort transformation $T$ mapping
distributions as follows (in obvious notation):

(4.2) $T_0 : Z \mapsto UZ + (1 - U)Z^*.$
Proof. Let $Z_1, Z_2, \ldots;\ U_1, U_2, \ldots$ be independent random variables, with every $Z_i \sim \nu$
and every $U_j \sim \mathrm{unif}(0, 1)$. Then, using the definition of $T_0$ repeatedly,

$W_n := \sum_{i=1}^{2^n} V_i^{(n)} Z_i \sim T_0^n \nu, \quad n \geq 0,$

where we define the random variables $V_i^{(n)}$ as follows. Using $U_1$ in the obvious fashion,
split the unit interval into intervals of lengths $U_1$ and $1 - U_1$. Now using $U_2$ and $U_3$, split
the first interval into subintervals of lengths $U_1 U_2$ and $U_1(1 - U_2)$ and the second interval
into subintervals of lengths $(1 - U_1)U_3$ and $(1 - U_1)(1 - U_3)$. Continue in this way (using
$U_1, \ldots, U_{2^n - 1}$) until the unit interval has been divided overall into $2^n$ subintervals. Call
their lengths, from left to right, $V_1^{(n)}, \ldots, V_{2^n}^{(n)}$.

Let $L_n := \max(V_1^{(n)}, \ldots, V_{2^n}^{(n)})$. We show that $L_n$ converges in probability to $0$ as
$n \to \infty$. Luckily, the complicated dependence structure of the variables $V_i^{(n)}$ does not
come into play; the only observation we need is that each $V_i^{(n)}$ marginally has the
same distribution as $U_1 \cdots U_n$. Indeed, abbreviate $V_1^{(n)}$ as $V_n$; briefly put, we derive a
Chernoff's bound for $\ln(1/V_n)$ and then simply use subadditivity. To spell things out,
let $x > 0$ be fixed and let $t \geq 0$. Then

$P(V_n \geq e^{-x}) \leq e^{tx}\,E V_n^t = e^{tx} \prod_{j=1}^{n} E U_j^t = e^{tx}(1 + t)^{-n} = \exp[-(n \ln(1 + t) - xt)].$

Choosing the optimal $t = \frac{n}{x} - 1$ (valid for $n \geq x$), this yields

$P(V_n \geq e^{-x}) \leq \exp[-(n \ln(n/x) - n + x)] = \exp[-(n(\ln n - \ln(ex)) + x)]$

and thus

$P(L_n \geq e^{-x}) \leq 2^n \exp[-(n(\ln n - \ln(ex)) + x)] = \exp[-(n(\ln n - \ln(2ex)) + x)] \to 0$

as $n \to \infty$.
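The interval-splitting construction and the resulting bound on $P(L_n \geq e^{-x})$ are easy to experiment with. The sketch below is an illustration under stated assumptions, not the authors' code: `split_lengths` and `max_length_bound` are hypothetical names, and the uniforms are consumed level by level from left to right, which matches the indexing $U_1; U_2, U_3; \ldots$ described in the proof.

```python
import math
import random


def split_lengths(n: int, rng: random.Random):
    """Lengths, from left to right, of the 2^n subintervals obtained after
    n rounds of splitting every current interval at an independent uniform."""
    lengths = [1.0]
    for _ in range(n):
        nxt = []
        for length in lengths:
            u = rng.random()                       # fresh uniform per interval
            nxt.extend((length * u, length * (1.0 - u)))
        lengths = nxt
    return lengths


def max_length_bound(n: int, x: float) -> float:
    """The bound exp[-(n(ln n - ln(2ex)) + x)] on P(L_n >= e^{-x}),
    derived in the text for 0 < x <= n."""
    if not (0 < x <= n):
        raise ValueError("bound derived for 0 < x <= n")
    return math.exp(-(n * (math.log(n) - math.log(2.0 * math.e * x)) + x))
```

For example, `max(split_lengths(10, random.Random(1)))` gives one realization of $L_{10}$, and `max_length_bound(40, 2.0)` evaluates the bound on $P(L_{40} \geq e^{-2})$. The bound is only informative once $\ln n$ exceeds $\ln(2ex)$, which reflects how slowly the longest of the $2^n$ subintervals shrinks.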

Since $L_n$ converges in probability to 0, we can therefore choose $\epsilon_n \to 0$ so that
$P(L_n > \epsilon_n) \to 0$. To prove the theorem, it then suffices to prove

$\tilde{W}_n := \mathbf{1}(L_n \leq \epsilon_n)\,W_n \Longrightarrow$ Cauchy$(m, \sigma)$.

For this, we note that the characteristic function $\phi_n$ of $\tilde{W}_n$ is given for $t \in \mathbb{R}$ by

(4.3) $\phi_n(t) = P(L_n > \epsilon_n) + E\left[\mathbf{1}(L_n \leq \epsilon_n) \prod_{i=1}^{2^n} \psi(V_i^{(n)} t)\right].$

We will show that $\phi_n(t)$ converges to $e^{\beta t} = e^{imt - \sigma t}$ for each fixed $t \geq 0$, and [since,
further, $\phi_n(-t) = \overline{\phi_n(t)}$] this will complete the proof of the theorem.

Indeed, we need only consider the second term in (4.3). For that, the calculus
estimates outlined in the proof of the lemma preceding Theorem 7.1.2 in [?] demonstrate
that, when $L_n \leq \epsilon_n$,

$\prod_{i=1}^{2^n} \psi(V_i^{(n)} t) = (1 + D_n)\,e^{\beta t}$

for complex random variables $D_n$ (depending on our fixed choice of $t \geq 0$) satisfying
$|D_n| \leq \delta_n$ for a deterministic sequence $\delta_n\ [= \delta(\epsilon_n t)] \to 0$ [with $\delta(s) \to 0$ as $s \to 0$].
[Leaving out the error estimates, the argument is

$\log \prod_{i=1}^{2^n} \psi(V_i^{(n)} t) \approx \sum_{i=1}^{2^n} \left[\psi(V_i^{(n)} t) - 1\right] \approx \beta t \sum_{i=1}^{2^n} V_i^{(n)} = \beta t.$]

It now follows easily that $\phi_n(t) \to e^{\beta t}$, as desired. □
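The heuristic $\prod_i \psi(V_i^{(n)} t) \approx e^{\beta t}$ can be watched at work in a case where everything is explicit. The sketch below is an illustration under loudly stated assumptions: the particular $\psi$ used, namely $\psi(s) = \exp[(im - \sigma)s - s^2/2]$ for $s \geq 0$ (the characteristic function of a Cauchy$(m, \sigma)$ variable plus an independent standard normal), is a hypothetical example chosen because it satisfies (4.1), and the function names are not from the paper.

```python
import cmath
import random


def split_lengths(n, rng):
    """Lengths of the 2^n subintervals after n rounds of uniform splitting."""
    lengths = [1.0]
    for _ in range(n):
        nxt = []
        for length in lengths:
            u = rng.random()
            nxt.extend((length * u, length * (1.0 - u)))
        lengths = nxt
    return lengths


def psi(s, m, sigma):
    """Example characteristic function for s >= 0: Cauchy(m, sigma) convolved
    with a standard normal, so that psi(s) = 1 + (im - sigma) s + O(s^2)."""
    return cmath.exp(complex(-sigma, m) * s - 0.5 * s * s)


def product_vs_limit(n, t, m, sigma, rng):
    """Return (prod_i psi(V_i t), e^{beta t}, L_n) for one realization."""
    lengths = split_lengths(n, rng)
    prod = complex(1.0, 0.0)
    for v in lengths:
        prod *= psi(v * t, m, sigma)
    limit = cmath.exp(complex(-sigma, m) * t)
    return prod, limit, max(lengths)
```

For this $\psi$ the product equals $e^{\beta t}\exp[-\tfrac{1}{2}t^2 \sum_i (V_i^{(n)})^2]$ exactly, and $\sum_i (V_i^{(n)})^2 \leq L_n$, so the relative error is at most $\tfrac{1}{2}t^2 L_n$. This is one concrete instance of an error term that is controlled by $L_n$ alone.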
Both the next lemma and its immediate corollary (Lemma 4.3) will be used in our
proof of Theorem 1.1.



Lemma 4.2. Let $\nu_i \in \mathcal{F}$, $i = 1, 2$. Suppose that $(Z_1, Z_2)$ is a coupling of $\nu_1$ and $\nu_2$ such
that the characteristic function of $Z_1 - Z_2$ satisfies (4.1). Then there exists a coupling
$(\tilde{Z}_1, \tilde{Z}_2)$ of $\nu_1$ and $\nu_2$ such that $\tilde{Z}_1 - \tilde{Z}_2 \sim$ Cauchy$(m, \sigma)$.

Proof. Extend $T$ to a transformation $T_2$ on the class $\mathcal{M}_2$ of probability measures on $\mathbb{R}^2$
by mapping the distribution $\xi \in \mathcal{M}_2$ of $(X, Y)$ to the distribution $T_2\xi$ of

$(UX + (1 - U)X^* + g(U),\ UY + (1 - U)Y^* + g(U)),$

where $U$, $(X, Y)$, and $(X^*, Y^*)$ are independent, with $(X, Y) \sim \xi$, $(X^*, Y^*) \sim \xi$, and
$U \sim \mathrm{unif}(0, 1)$, and where $g$ is given by (1.1). (Note that we use the same uniform $U$ for
the $Y$s as for the $X$s!) Of course, $T_2$ maps the marginal distributions $\xi_1(\cdot) = \xi(\cdot \times \mathbb{R})$
of $X$ and $\xi_2(\cdot) = \xi(\mathbb{R} \times \cdot)$ of $Y$ into $T\xi_1$ and $T\xi_2$, respectively; more importantly for our
purposes, it maps the distribution, call it $\hat{\xi}$, of $X - Y$ into the distribution $T_0\hat{\xi}$, with $T_0$
defined at (4.2).
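The key point of the construction, that sharing the same uniform $U$ makes the toll term $g(U)$ cancel in the difference of the coordinates, can be seen in a few lines. This is a minimal sketch with hypothetical function names (`t2_step`, `t0_step`) acting on single realizations rather than on distributions.

```python
import math


def g(u: float) -> float:
    """g(u) = 2u ln u + 2(1-u) ln(1-u) + 1, reading 0 * ln 0 as 0."""
    def xlogx(x: float) -> float:
        return x * math.log(x) if x > 0.0 else 0.0
    return 2.0 * xlogx(u) + 2.0 * xlogx(1.0 - u) + 1.0


def t2_step(xy, xy_star, u):
    """One realization of the coupled map: the same u (and hence the same
    shift g(u)) is used for the X-coordinate and the Y-coordinate."""
    (x, y), (x_star, y_star) = xy, xy_star
    shift = g(u)
    return (u * x + (1.0 - u) * x_star + shift,
            u * y + (1.0 - u) * y_star + shift)


def t0_step(d, d_star, u):
    """One realization of the homogeneous map applied to differences."""
    return u * d + (1.0 - u) * d_star
```

Given pairs `(x, y)` and `(x_star, y_star)`, the difference of the two outputs of `t2_step` agrees with `t0_step(x - y, x_star - y_star, u)`, which is the pathwise form of the statement that $T_2$ acts on the distribution of $X - Y$ as $T_0$.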

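Now let $\nu \in \mathcal{M}_2$, with marginals $\nu_i$, $i = 1, 2$, be the distribution of the given coupling
$(Z_1, Z_2)$. Then $(T_2^n \nu)_{n \geq 1}$ has constant marginals
$(\nu_1, \nu_2)$ as $n$ varies and so is a tight sequence. We then can find a weakly convergent
subsequence, say,

$T_2^{n_k} \nu \Longrightarrow \nu^{\infty} \in \mathcal{M}_2;$

of course, the limit $\nu^{\infty}$ again has marginals $\nu_i$, $i = 1, 2$. Moreover, writing a hat for
the distribution of the difference of the two coordinates,

$T_0^{n_k} \hat{\nu} = \widehat{T_2^{n_k} \nu} \Longrightarrow \widehat{\nu^{\infty}}.$

But, by supposition, the characteristic function of $\hat{\nu}$ satisfies (4.1), so Theorem 4.1
implies that $\widehat{\nu^{\infty}}$ is Cauchy$(m, \sigma)$. Thus $\nu^{\infty} \in \mathcal{M}_2$ supplies the desired coupling. □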



Lemma 4.3. Let $\nu_i \in \mathcal{F}$, $i = 1, 2$. Suppose that $(Z_1, Z_2)$ is a coupling of $\nu_1$ and $\nu_2$
such that $Z_1 - Z_2$ has zero mean and finite variance. Then $\nu_1 = \nu_2$. □
4.2 The proof

We now complete the proof of Theorem 1.1.

Proof. As discussed in Section 1, it is simple to check that $\mathcal{C} \subseteq \mathcal{F}$ (and that the elements
of $\mathcal{C}$ are all distinct).

Conversely, given $\nu \in \mathcal{F}$, let $Z_1 \sim \nu_1 := \nu$ and $Z_2 \sim \nu_2 := \mu$ be independent random
variables (on some probability space); recall that $\mu$ is the limiting Quicksort measure,
with zero mean and finite variance. Write $\psi_i$, $i = 1, 2$, for the characteristic functions
corresponding respectively to $\nu_i$, $i = 1, 2$. By Lemma 3.2 (or see Corollary 3.3), $\psi_1$
satisfies (4.1) [for some $(m, \sigma)$]. Of course, $\psi_2$ satisfies (4.1) with $\beta$ taken to be 0, so the
characteristic function $t \mapsto \psi_1(t)\,\psi_2(-t)$ of $Z_1 - Z_2$ satisfies (4.1) for the same $(m, \sigma)$
as for $\psi_1$. Applying Lemma 4.2, there exists a coupling $(\tilde{Z}_1, \tilde{Z}_2)$ of $\nu_1$ and $\nu_2$ such
that $C := \tilde{Z}_1 - \tilde{Z}_2 \sim$ Cauchy$(m, \sigma)$. Without loss of generality (by building a suitable
product space), we may assume the existence of a random variable $Y \sim \mu$ on the same
probability space as $\tilde{Z}_1$ and $\tilde{Z}_2$ such that $Y$ and $C$ are independent.

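We know that the distribution $\nu_1$ of $\tilde{Z}_1 = \tilde{Z}_2 + C$ is a fixed point of $T$. But so is the
distribution $\nu_1' \in \mathcal{C}$ of $Z := Y + C$. Since $\tilde{Z}_1 - Z = \tilde{Z}_2 - Y$ has zero mean and finite
variance, Lemma 4.3 applied to $(\tilde{Z}_1, Z)$ gives $\nu = \nu_1 = \nu_1' \in \mathcal{C}$,
as desired. □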